AITopics

Neural Information Processing Systems, Oct-2-2025, 19:12:32 GMT

Neural Jump Stochastic Differential Equations

Neural jump stochastic differential equations provide a data-driven approach to learning continuous and discrete dynamic behavior, i.e., hybrid systems that both flow and jump.

artificial intelligence, conditional intensity, machine learning, (11 more...)

Country: North America (0.28)

Industry: Banking & Finance (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications (0.94)

Yuchi, Henry Shaowu, Zhu, Shixiang, Dong, Li, Arisoy, Yigit M., Spencer, Matthew C.

New User Event Prediction Through the Lens of Causal Inference

arXiv.org Artificial Intelligence, Jul-10-2024

Modeling and analyzing event series generated by heterogeneous users with varied behavioral patterns underlies many everyday applications, including credit card fraud detection, online platform user recommendation, and social network analysis. The most commonly adopted approach to this task is to classify users into behavior-based categories and analyze each category separately. However, this approach requires extensive data to fully understand user behavior, which makes it hard to model newcomers who have no history. In this paper, we propose a novel discrete event prediction framework for new users through the lens of causal inference. Our method offers an unbiased prediction for new users without needing to know their categories. We treat the user event history as the "treatment" for future events and the user category as the key confounder. Thus, the prediction problem can be framed as counterfactual outcome estimation, with the new user model trained on an adjusted dataset where each event is re-weighted by its inverse propensity score. We demonstrate the superior performance of the proposed framework with a numerical simulation study and two real-world applications: Netflix rating prediction and seller contact prediction for customer support at Amazon.
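The re-weighting step described in the abstract can be illustrated with a minimal sketch. It assumes a discrete stand-in for the treatment (a binary summary of the event history) and estimates the propensity P(treatment | category) by simple counting; the function name `inverse_propensity_weights` and the toy data are hypothetical, and the paper's actual propensity model is not given in this listing.

```python
from collections import Counter


def inverse_propensity_weights(treatments, categories):
    """Return 1 / P(treatment | category) for each record, estimated by counting."""
    pair_counts = Counter(zip(treatments, categories))
    category_counts = Counter(categories)
    weights = []
    for t, c in zip(treatments, categories):
        # Empirical propensity of seeing treatment t inside category c.
        propensity = pair_counts[(t, c)] / category_counts[c]
        weights.append(1.0 / propensity)
    return weights


# Hypothetical toy data: treatment 1 is common in category "a" and rare in "b".
treatments = [1, 1, 1, 0, 1, 0, 0, 0]
categories = ["a", "a", "a", "a", "b", "b", "b", "b"]
weights = inverse_propensity_weights(treatments, categories)
```

After re-weighting, each treatment value carries the same total weight within a category, so the treatment is no longer associated with the confounder in the adjusted dataset.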

event sequence, history, sequence, (16 more...)

2407.05625

Country:

North America > United States > New York (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (0.68)
Law Enforcement & Public Safety > Fraud (0.54)
Media > Film (0.48)
Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Communications > Social Media (0.68)
(2 more...)

Joseph, Sobin, Jain, Shashi

Non-Parametric Estimation of Multi-dimensional Marked Hawkes Processes

arXiv.org Artificial Intelligence, Feb-7-2024

An extension of the Hawkes process, the marked Hawkes process features a variable jump size at each event, in contrast to the constant jump size of a Hawkes process without marks. While an extensive literature addresses non-parametric estimation of both linear and non-linear Hawkes processes, there remains a significant gap regarding the marked Hawkes process. In response, we propose a methodology for estimating the conditional intensity of the marked Hawkes process. We introduce two distinct models: Shallow Neural Hawkes with marks, for Hawkes processes with excitatory kernels, and Neural Network for Non-Linear Hawkes with Marks, for non-linear Hawkes processes. Both approaches take the past arrival times and their corresponding marks as input to obtain the arrival intensity. The approach is entirely non-parametric and preserves the interpretability associated with the marked Hawkes process. To validate our method, we test it on synthetic datasets with known ground truth. Additionally, we apply it to cryptocurrency order book data, demonstrating its applicability to real-world scenarios.
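For orientation, the quantity being estimated can be sketched in a simple parametric form. The block below assumes an exponential kernel and a jump size that scales linearly with the mark; both are illustrative assumptions, as are the function name and parameter values. The paper itself estimates the intensity non-parametrically with neural networks.

```python
import math


def marked_hawkes_intensity(t, events, mu=0.5, alpha=0.8, beta=1.0):
    """Conditional intensity mu + sum over t_i < t of m_i * alpha * exp(-beta * (t - t_i)).

    `events` is a list of (arrival_time, mark) pairs; the mark m_i scales the jump.
    """
    total = mu
    for t_i, m_i in events:
        if t_i < t:  # only strictly earlier events contribute
            total += m_i * alpha * math.exp(-beta * (t - t_i))
    return total


# Hypothetical history: a large-mark event at t=1.0 and a small-mark event at t=2.0.
events = [(1.0, 2.0), (2.0, 0.5)]
lam_before = marked_hawkes_intensity(0.5, events)  # no past events, baseline only
lam_at_two = marked_hawkes_intensity(2.0, events)  # only the event at t=1.0 counts
```

Setting every mark to 1 recovers the constant jump size of an unmarked Hawkes process with the same kernel.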

hawkes process, intensity, kernel, (15 more...)

2402.0474

Country:

Asia > India > Karnataka > Bengaluru (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Dash, Saurabh, She, Xueyuan, Mukhopadhyay, Saibal

Learning Point Processes using Recurrent Graph Network

arXiv.org Artificial Intelligence, Aug-11-2022

We present a novel Recurrent Graph Network (RGN) approach for predicting discrete marked event sequences by learning the underlying complex stochastic process. Using the framework of point processes, we interpret a marked discrete event sequence as the superposition of different sequences, each of a unique type. The nodes of the Graph Network use LSTMs to incorporate past information, whereas a Graph Attention Network (GAT) introduces strong inductive biases to capture the interaction between these different types of events. By changing the self-attention mechanism from attending over past events to attending over event types, we reduce time and space complexity from $\mathcal{O}(N^2)$, where $N$ is the total number of events, to $\mathcal{O}(|\mathcal{Y}|^2)$, where $|\mathcal{Y}|$ is the number of event types. Experiments show that the proposed approach improves performance in log-likelihood, prediction, and goodness-of-fit tasks with lower time and space complexity compared to state-of-the-art Transformer-based architectures.

dataset, node, sequence, (14 more...)

2208.05736

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Oct-16-2020

Goodness-of-Fit Test of Mismatched Models for Self-Exciting Processes

Wei, Song, Zhu, Shixiang, Zhang, Minghe, Xie, Yao

We develop a goodness-of-fit (GOF) test for generative models of self-exciting processes by connecting this problem to the classical statistical theory of the quasi-maximum-likelihood estimator (QMLE). We present a non-parametric self-normalizing statistic for the GOF test, the Generalized Score (GS) statistic, and explicitly capture the model misspecification when establishing its asymptotic distribution. Numerical experiments on simulated and real data validate our theory and demonstrate the good performance of the proposed GS test.

artificial intelligence, machine learning, qmle, (19 more...)

2006.09439

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Iraq (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Asia > Japan (0.04)

Genre: Research Report (1.00)

Industry: Law Enforcement & Public Safety (0.46)

Faruqui, Syed Hasib Akhter, Alaeddini, Adel, Wang, Jing, Jaramillo, Carlos A.

A Functional Model for Structure Learning and Parameter Estimation in Continuous Time Bayesian Network: An Application in Identifying Patterns of Multiple Chronic Conditions

arXiv.org Artificial Intelligence, Jul-31-2020

Bayesian networks are powerful statistical models for studying the probabilistic relationships among a set of random variables, with major applications in disease modeling and prediction. Here, we propose a continuous time Bayesian network with conditional dependencies, represented as Poisson regression, to model the impact of exogenous variables on the conditional dependencies of the network. We also propose an adaptive regularization method with an intuitive early stopping feature based on density-based clustering for efficient learning of the structure and parameters of the proposed network. Using a dataset of patients with multiple chronic conditions extracted from electronic health records of the Department of Veterans Affairs, we compare the performance of the proposed approach with some of the existing methods in the literature for both short-term (one-year ahead) and long-term (multi-year ahead) predictions. The proposed approach provides a sparse, intuitive representation of the complex functional relationships between multiple chronic conditions. It also makes it possible to analyze multiple disease trajectories over time given any combination of prior conditions.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2007.15847

Country:

North America > United States > Texas (0.14)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Government > Military (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, May-15-2020

Spatio-Temporal Point Processes with Attention for Traffic Congestion Event Modeling

Zhu, Shixiang, Ding, Ruyi, Zhang, Minghe, Van Hentenryck, Pascal, Xie, Yao

We present a novel framework for modeling traffic congestion events over road networks based on new mutually exciting spatio-temporal point process models with attention mechanisms and neural network embeddings. Using multi-modal data that combines count data from traffic sensors with police reports of traffic incidents, we aim to capture two types of triggering effects for congestion events: current congestion at one location may cause future congestion elsewhere on the road network, and traffic incidents may cause congestion to spread. To capture the non-homogeneous temporal dependence of an event on the past, we introduce a novel attention-based mechanism built on neural network embeddings for the point process model. To incorporate the directional spatial dependence induced by the road network, we adapt the "tail-up" model from spatial statistics to the traffic network setting. We demonstrate the superior performance of our approach compared to state-of-the-art methods on both synthetic and real data.

data mining, machine learning, natural language, (19 more...)

2005.08665

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Gao, Tian, Subramanian, Dharmashankar, Shanmugam, Karthikeyan, Bhattacharjya, Debarun, Mattei, Nicholas

A Multi-Channel Neural Graphical Event Model with Negative Evidence

arXiv.org Machine Learning, Feb-21-2020

Event datasets are sequences of events of various types occurring irregularly over the timeline, and they are increasingly prevalent in numerous domains. Existing work on modeling events using conditional intensities relies either on an underlying parametric form to capture historical dependencies or on non-parametric models that focus primarily on tasks such as prediction. We propose a non-parametric deep neural network approach to estimate the underlying intensity functions. We use a novel multi-channel RNN that optimally reinforces the negative evidence of no observable events by introducing fake event epochs within each consecutive inter-event interval. We evaluate our method against state-of-the-art baselines on model fitting tasks as gauged by log-likelihood. Through experiments on both synthetic and real-world datasets, we find that our proposed approach outperforms existing baselines on most of the datasets studied.
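The idea of placing fake event epochs inside each inter-event interval can be sketched as a data augmentation step. The block below assumes evenly spaced fake epochs and a boolean flag marking real events; the function name `add_fake_epochs`, the spacing rule, and the number of epochs per interval are illustrative choices, not the placement scheme used in the paper.

```python
def add_fake_epochs(event_times, per_interval=1):
    """Interleave real event times with evenly spaced fake epochs.

    Returns (time, is_real) pairs; fake epochs carry is_real=False and stand
    for moments at which no event was observed.
    """
    augmented = []
    for start, end in zip(event_times, event_times[1:]):
        augmented.append((start, True))
        step = (end - start) / (per_interval + 1)
        for k in range(1, per_interval + 1):
            augmented.append((start + k * step, False))
    if event_times:
        augmented.append((event_times[-1], True))  # keep the final real event
    return augmented


# Hypothetical sequence of three real events.
augmented = add_fake_epochs([0.0, 2.0, 3.0], per_interval=1)
```

A sequence model fed the augmented sequence sees explicit inputs during quiet periods, which is one way to expose it to the absence of events.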

dataset, fake epoch, graphical event model, (15 more...)

2002.09575

Country:

South America > Argentina (0.16)
South America > Brazil (0.14)
South America > Venezuela (0.04)
(5 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine (1.00)
Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Jun-12-2019

Reinforcement Learning of Spatio-Temporal Point Processes

Zhu, Shixiang, Li, Shuang, Xie, Yao

Spatio-temporal event data is ubiquitous in various applications, such as social media, crime events, and electronic health records. Spatio-temporal point processes offer a versatile framework for modeling such event data, as they can jointly capture spatial and temporal dependency. A key question is how to estimate the generative model for such point processes, which enables subsequent machine learning tasks. Existing works mainly focus on parametric models for the conditional intensity function, such as the widely used multi-dimensional Hawkes processes. However, parametric models tend to lack flexibility in tackling real data. On the other hand, non-parametric models for spatio-temporal point processes tend to be less interpretable. We introduce a novel and flexible semi-parametric spatio-temporal point process model that combines spatial statistical models based on heterogeneous Gaussian mixture diffusion kernels, whose parameters are represented using neural networks. We learn the model using a reinforcement learning framework, where the reward function is defined via the maximum mean discrepancy (MMD) of the empirical processes generated by the model and the real data. Experiments based on real data show the superior performance of our method relative to the state-of-the-art.
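The discrepancy behind the reward can be illustrated with a minimal MMD estimate. The sketch below assumes each sample is a fixed-length tuple of coordinates and uses a Gaussian kernel with a hand-picked bandwidth; the paper compares whole event sequences, so the kernel, the function names, and the toy points here are simplifying assumptions.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length coordinate tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical (x, y) locations of real events and of events from a poor generator.
real = [(0.1, 0.2), (0.4, 0.5), (0.9, 0.3)]
generated = [(2.0, 2.1), (2.5, 1.9), (1.8, 2.4)]
gap = mmd_squared(real, generated)
no_gap = mmd_squared(real, real)
```

A smaller discrepancy means the generated sample is harder to tell apart from the real one, so a reward built on it would favor generators whose output resembles the data.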

machine learning, point process, reinforcement learning, (14 more...)

1906.05467

Country: North America > United States > New York (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Health Care Technology > Medical Record (0.54)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)